- ZeRO-2, gradient partitioning across GPUs.
- ZeRO-3, parameter partitioning across GPUs.
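
To make the effect of each stage concrete, here is a rough sketch of per-GPU memory for model states. It assumes mixed-precision Adam training with 2 bytes per parameter for fp16 weights, 2 bytes for fp16 gradients, and 12 bytes for fp32 optimizer states, and it assumes the stages are cumulative (ZeRO-2 also partitions optimizer states, ZeRO-3 also partitions gradients). The function name `zero_bytes_per_gpu` is hypothetical and not part of any library. Activations, buffers, and fragmentation are ignored.

```python
def zero_bytes_per_gpu(num_params: int, num_gpus: int, stage: int) -> float:
    """Approximate model-state bytes per GPU (assumes mixed-precision Adam)."""
    if stage not in (0, 1, 2, 3):
        raise ValueError("stage must be 0, 1, 2, or 3")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")

    params = 2.0 * num_params   # fp16 parameters (assumed 2 bytes each)
    grads = 2.0 * num_params    # fp16 gradients (assumed 2 bytes each)
    optim = 12.0 * num_params   # fp32 weights copy, momentum, variance (assumed)

    if stage >= 1:
        optim /= num_gpus       # optimizer states split across GPUs
    if stage >= 2:
        grads /= num_gpus       # ZeRO-2: gradients split across GPUs
    if stage >= 3:
        params /= num_gpus      # ZeRO-3: parameters split across GPUs
    return params + grads + optim


if __name__ == "__main__":
    # Illustrative sizes only: a 7.5B-parameter model on 64 GPUs.
    for stage in range(4):
        gib = zero_bytes_per_gpu(7_500_000_000, 64, stage) / 2**30
        print(f"stage {stage}: {gib:.1f} GiB per GPU")
```

Under these assumptions, ZeRO-2 leaves only the fp16 parameters replicated on every GPU, while ZeRO-3 shrinks all model states in proportion to the number of GPUs, at the cost of extra communication to gather parameters when they are needed.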